Collecting Graphical Abstract Views 
of Mercury Program Executions * 

Erwan Jahier 
IRISA, Campus Universitaire de Beaulieu, 

F-35042 RENNES Cedex - France 
jahier@irisa.fr, www.irisa.fr/lande/jahier 



Abstract 



A program execution monitor is a program that collects and ab- 
stracts information about program executions. The collect operator 
is a high level, general purpose primitive which lets users implement 
their own monitors. Collect is built on top of the Mercury trace. In 
previous work, we have demonstrated how this operator can be used 
to efficiently collect various kinds of statistics about Mercury program 
executions. In this article we further demonstrate the expressive power 
and effectiveness of collect by providing more monitor examples. In 
particular, we show how to implement monitors that generate graphi- 
cal abstractions of program executions such as proof trees, control flow 
graphs and dynamic call graphs. We show how those abstractions can 
be easily modified and adapted, since those monitors only require
several dozen lines of code. Those abstractions are intended to serve
as front-ends of software visualization tools. Although collect is
currently implemented on top of the Mercury trace, none of its underlying
concepts depends on Mercury, and it can be implemented on top of any
tracer for any programming language.



*In M. Ducasse (ed), proceedings of the Fourth International Workshop on Automated
Debugging (AADEBUG 2000), August 2000, Munich. Computer Research Repository
(http://www.acm.org/corr/), cs.SE/0010038; whole proceedings: cs.SE/0010035.

1 Introduction

A program execution monitor is a program that collects and abstracts infor- 
mation about program executions. The monitoring functionalities of existing 
systems are built on top of ad hoc instrumentations. Most of them are
implemented by subtle modifications of the runtime system; therefore,
implementing such monitors requires an in-depth knowledge of the system. The best
people to implement these instrumentations are generally the implementors 
of the compiler. They, however, cannot decide which data to gather. Indeed, 
hundreds of variants can be useful and only end-users know what they want. 

Collect is a high level, general purpose, foldl-like operator which lets users
implement their own monitors. We have demonstrated in [11] how this operator
can be used to collect various kinds of statistics about Mercury program
executions, such as the number of predicate calls, the number of
events for each event type (port), or the number of events at each depth.
We have also shown how to measure test coverage ratios; this
information is useful to assess the quality of a test set. The aims of these
examples were twofold: demonstrating the expressive power of collect and
the efficiency of the resulting monitors. In this article, we propose more
program abstractions implemented with collect. The goal is to further
assess the expressive power of collect in a pragmatic way by implementing a
wider range of monitors and to check that the resulting monitors are still
reasonably efficient. The monitors described in this article require slightly
more programming effort than the ones in [11]. In particular, we show how
to implement monitors that generate graphical abstractions of program
executions such as proof trees, control flow graphs and dynamic call graphs. We
show how those abstractions can be modified and adapted. We believe that
collect could be the basis of software visualization tools. This article also
aims to be a tutorial about how to implement (Mercury) program monitors
with collect.

All the monitors given in this article are run under the Morphine trace
analysis system. Morphine [13] is a Prolog interpreter enhanced with
primitives (collect included) to communicate with a Mercury program's execution.
Morphine is a fully programmable command line interface for interactively
monitoring and debugging Mercury program executions. The use of collect is
independent of other Morphine concepts though; we only use the Prolog part
of Morphine to post-process the monitors' results. All the examples are verbosely
paraphrased so no knowledge about Mercury or Prolog should be required to
understand them. Section 2 gives a quick overview of the language and the
trace system of Mercury, as well as an informal presentation of collect.
Sections 3.1, 3.2, and 3.3 show how to implement monitors that generate control
flow graphs, dynamic call graphs and proof trees respectively. Section 4
describes how monitors can be merged. Section 5 discusses performance issues
and Section 6 related work.



2 Flexible and efficient monitoring of Mercury



The collect operator [11] is a generic primitive designed to let users easily
implement efficient Mercury program monitors. In this section, we give a brief
review of the Mercury programming language and of the collect monitoring
operator.



2.1 A brief introduction to Mercury 

Mercury [19] is a purely declarative, logical and functional programming
language. Its syntax is very similar to that of Prolog. The main difference
with Prolog is that users must declare the type, the mode and the
determinism of the predicates (and functions) they define. These declarations let the
Mercury compiler produce very robust and efficient code.



:- pred queen(list(int)::in, list(int)::out) is nondet.
queen(Data, Out) :-
    qperm(Data, Out),
    safe(Out).

Figure 1: The Mercury predicate queen/2 



Figure 1 shows an example of Mercury code. It is a predicate of a Mercury
program that solves the well-known n queens problem. This program
is given in Appendix 1. The first line is the type and mode declaration of
predicate queen/2¹. This line states that the first argument of queen/2 is
a list of integers and this argument is an input (in). It also states that its
second argument is a list of integers and it is an output (out). The nondet
determinism marker means that this predicate can have any number of
solutions. Actually, it has two solutions. The list of integers codes the board.
The predicate qperm/2 generates all the possible configurations by producing
all the permutations of the list of integers. Then safe/1 checks that a given
configuration is a solution, namely, that there is no more than one queen per
diagonal.

1 queen/2 denotes a predicate named queen of arity 2.



The Mercury trace. A trace is a sequence of events. An event is a tuple
of event attributes. An event attribute is an elementary piece of
information that can be extracted from the current state at particular points of the
program execution. The program points of the Mercury trace are chosen
according to a trace model that is called Byrd's box model [?]: a call event is
generated when a predicate is called; an exit event is generated if it succeeds;
a fail event is generated if it fails; a redo event is generated if the execution
backtracks on a predicate to see if it can produce other solutions.
Actually, the Mercury trace is an extended version of Byrd's box model because
events are also generated when the execution enters a branch of a
disjunction or of an if-then-else. In the following, these events are called internal
and Byrd's events are called external. The complete list of Mercury event
attributes is given in Appendix 2.



2.2 The collect monitoring operator 

Debugging and monitoring can be seen as the processing of a list of events.
The standard functional programming operator foldl encapsulates a simple
pattern of recursion for processing lists². As demonstrated by Hutton [?],
foldl has great expressive power for processing lists. Therefore foldl is likely
to be a good abstraction to implement dynamic analysis tools.

2 The foldl operator takes as arguments a function, a list, and an initial value of an
accumulator; it outputs the final value of the accumulator; this final value is obtained by
applying the function to the current value of the accumulator and each element of the list
successively; the l at the end of foldl means that the list is processed from left to right.

However, implementing monitors by collecting the whole execution trace
into a list of events, and then applying a foldl to that list, would be far too
inefficient. It would require creating and processing a list of possibly millions
of events. To implement efficient monitors, runtime information needs to be
collected and analyzed on the fly. The primitive collect [11] is a foldl operator
which is directly plugged into the trace system. First, a global variable is
created and initialized. Then, whenever an event occurs, the collect interface
is called instead of the standard debugger. The collect interface calls the
filtering predicate, which updates the global variable and then gives control
back to the execution.

It is important to note here that, for performance reasons, there is no
coroutining between different Operating System (OS) processes (the program
and its monitor) but only procedure calls that update a global variable. The
collect operator was initially designed precisely to avoid the expensive OS
level context switches induced by coroutining (see the related work
section).
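
The difference between folding over a stored trace and folding on the fly
can be illustrated with a small Python model. It is only a sketch of the
idea, not Mercury code and not the actual collect implementation: the event
dictionaries, the trace generator and the Python function named collect are
assumptions made for this illustration.

```python
from functools import reduce

def trace():
    """Hypothetical event source: yields events one at a time, like a tracer."""
    ports = ["call", "call", "exit", "call", "fail", "exit"]
    for chrono, port in enumerate(ports, start=1):
        yield {"chrono": chrono, "port": port}

def count_events(acc, event):
    """Filtering function: the accumulator is simply an event counter."""
    return acc + 1

# Naive version: materialise the whole trace as a list, then fold over it.
all_events = list(trace())
naive_result = reduce(count_events, all_events, 0)

def collect(events, initial, filter_):
    """On-the-fly version: update one accumulator as each event arrives."""
    acc = initial
    for event in events:      # an event is not kept once it has been processed
        acc = filter_(acc, event)
    return acc

on_the_fly_result = collect(trace(), 0, count_events)
print(naive_result, on_the_fly_result)
```

Both versions compute the same value; the second one never holds more than
one event at a time, which is the property the text relies on.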

For the time being, the only implementation of collect we have is done
on top of the Mercury trace. The trace component of the Mercury system
has been extended so that it is able to call the collect interface rather
than the Mercury debuggers [18]. To implement monitors using collect,
users need to give an initial value to the accumulator by defining a
Mercury predicate named initialize/1, and to update the accumulator at each
event by defining a Mercury predicate named filter/4. Since Mercury is a
typed language, users also need to define the type of the collecting variable,
collected_type.



% 1 - Import of Mercury library modules:
:- import_module < Mercury modules >.

% 2 - Definition of the type of the collecting variable:
:- type collected_type == < A Mercury type >.

% 3 - Initialization of the collecting variable:
initialize(Accumulator) :-
    < Mercury goals which initialize the collecting variable >.

% 4 - Updating of the global variable:
filter(Event, AccumulatorIn, AccumulatorOut, StopFlag) :-
    < Mercury goals which update the collecting variable >.

Figure 2: What the user needs to type to define a monitor with collect 



This is summed up in Figure 2; predicates initialize/1 and filter/4
must have the following Mercury declarations:

:- pred initialize(collected_type::out) is det.
:- pred filter(event::in, collected_type::in,
    collected_type::out, stop_or_continue::out) is det.

The type event is a structure which contains all the Mercury event attributes.
To access those attributes, the monitor designer can use attribute accessor
functions whose prototypes are of the form:

:- func <attribute_name>(event::in) = <attribute_type>::out.

For example, the function depth(Event) returns the depth of the event Event
(the full list of attribute names is given in Appendix 2). The fourth
argument of filter/4 is a binary flag that can be set to stop or continue,
depending on whether or not one wants to stop the monitoring process
before the end of the execution is reached. The current front-end of collect is a
Prolog interpreter. Having a full programming environment is very useful to
post-process the results of the monitor. If a file called my_monitor contains
a definition of initialize/1 and filter/4, then the call collect(queens,
my_monitor, Result) binds Result to the result of applying the monitor specified
in my_monitor to the Mercury program queens. For example, Figure 3
contains the full code of a monitor that counts the number of predicate calls.
If this code is in a file called count_call, then the query collect(queens,
count_call, Result) will bind Result to the number of predicate calls that
occur during the execution of the program queens.



:- import_module int.

:- type collected_type == int.

initialize(0).

filter(Event, Cpt0, Cpt, continue) :-
    ( if port(Event) = call then
        Cpt = Cpt0 + 1
    else
        Cpt = Cpt0
    ).

Figure 3: count_call: a monitor that counts the number of predicate calls 



Here is a line by line description of the code of Figure 3. To do that,
here and in the following descriptions of monitors defined in this article, we
successively describe each of the four points that need to be addressed to
define a monitor using collect. (1) Importing necessary Mercury modules:
here, we only need to import the library module int that defines everything
that is concerned with integers. (2) Defining the type of the collecting
variable: here, it is an integer. (3) Initializing the collecting variable: here, it is
initialized to 0. (4) Defining the filtering predicate: here, it increments the
global variable whenever the port attribute of the current event is call.
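
The four points above, together with the stop flag of filter/4, can be
mimicked in a few lines of Python. This is a sketch under stated
assumptions rather than the real interface: events are plain dictionaries,
the Python collect function stands in for the Morphine command, and the
monitor calls_before_first_fail is a hypothetical variant added only to
show the effect of the flag.

```python
CONTINUE, STOP = "continue", "stop"

def collect(events, initialize, filter_):
    """Model of collect: fold filter_ over the events, honouring the stop flag."""
    acc = initialize()
    for event in events:
        acc, flag = filter_(event, acc)
        if flag == STOP:
            break
    return acc

# (2) the collecting variable is an integer; (3) it starts at 0.
def initialize():
    return 0

# (4) the filtering function increments the counter at call events.
def count_call(event, cpt0):
    if event["port"] == "call":
        return cpt0 + 1, CONTINUE
    return cpt0, CONTINUE

# Hypothetical variant: same counter, but monitoring stops at the first fail.
def calls_before_first_fail(event, cpt0):
    if event["port"] == "fail":
        return cpt0, STOP
    return count_call(event, cpt0)

# Hand-written trace used only for this example.
EVENTS = [{"port": p}
          for p in ["call", "call", "exit", "call", "fail", "call", "exit"]]

total_calls = collect(EVENTS, initialize, count_call)
early_calls = collect(EVENTS, initialize, calls_before_first_fail)
print(total_calls, early_calls)
```

On this hand-written trace the first monitor sees all four call events,
while the second one stops before the last call.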







Figure 4: The various components involved and their relations when the user
invokes the command "collect(queens, count_call, Result)".



Figure 4 shows the various components involved and their relations when
the user invokes the command "collect(queens, count_call, Result)."
at the Morphine prompt. Square boxes represent source files. Round boxes
represent object and executable files. The file containing the definition of
the monitor, count_call, is transformed by Morphine into a Mercury
module. The arrows labeled with mmc denote a call to the Melbourne Mercury
compiler. The compilation of count_call.m and queens.m is only done if
necessary, i.e., if something has changed in the source code since the last time
it was compiled. The executable file queens-count_call is obtained by
dynamically linking the executable file queens and the object file count_call.o.
The output of this program, 157, is unified with the logical variable Result
at the Morphine prompt.



3 Collecting graphical abstract views 

In this section, we show how to generate several kinds of program execution
abstract views based on graphs. All the graphs of this section are visualized
with the dot drawing tool [16]. The final objective is to have more
sophisticated back-ends such as the visualization tools presented in [20]. The point
of this section is not to provide an exhaustive set of graphical abstractions.
The point is to show how easy it can be to implement them and, more
interestingly, how easy it is to get a variant of an existing abstraction. Indeed,
different visualization tools often need slightly different images of the
execution. The full code that implements the production of these graphs is
available in the control_flow scenario of the Morphine distribution³.

3.1 Control flow graphs 



We define the control flow graph of a logic program execution as the
directed graph where: nodes are predicates of the program; and arcs indicate
that the program control passed from the origin to the destination node
during the execution. Control flow graphs are useful execution abstractions
for users to understand what a program actually does. They are also the basis
of a lot of dynamic analyses. The control flow graph of the n queens program
is given in Figure 5. We can see that, during the program execution, the
control passed from predicate main/2 to predicate data/1, from predicate
data/1 to predicate data/1 (recursive call) and predicate queen/2, etc. An
implementation of a monitor that computes such a graph is given in Figure 6.

3 http://www.irisa.fr/lande/jahier/download.html




Figure 5: The control flow graph of the n queens program

:- import_module set.

:- type predicate ---> proc_name/arity.
:- type arc ---> arc(predicate, predicate).
:- type graph == set(arc).
:- type collected_type ---> collected_type(predicate, graph).

initialize(collected_type("user"/2, set__init)).

filter(Event, Acc0, Acc, continue) :-
    Port = port(Event),
    ( if (Port = call ; Port = exit ; Port = fail ; Port = redo) then
        Acc0 = collected_type(PreviousPred, Graph0),
        CurrentPred = proc_name(Event) / proc_arity(Event),
        Arc = arc(PreviousPred, CurrentPred),
        set__insert(Graph0, Arc, Graph),
        Acc = collected_type(CurrentPred, Graph)
    else
        % internal events
        Acc = Acc0
    ).



Figure 6: Monitor that calculates control flow graphs

The monitor of Figure 6 is defined as follows. (1) The set module of the Mercury
library is imported. (2) Graphs are encoded by a set of arcs, and arcs are
terms made up of two predicates. The collecting variable is made of a
predicate and a graph. The predicate is used to remember the previously
visited node. (3) The collecting variable is initialized with the predicate
main/2, the top level predicate of every Mercury program, and the empty
graph (set__init/0). (4) For every external event, we insert in the graph
(set__insert/3) an arc from the previous predicate to the current one.

When the program execution has terminated, we post-process the result with
dot, a system that takes a graph description and displays a PostScript
rendering of it. Figure 5 is the output of such post-processing applied to the
result of the monitor of Figure 6. This post-processing stage only requires
a few dozen lines of Prolog code. In our definition of control flow graph,
the number of times each arc is traversed is not given. Even if the control
passed between two nodes more than once, only one arc is represented. One
can imagine variants where, for example, arcs are labeled by a
chronological counter. The corresponding monitor can be implemented by replacing
arc(predicate, predicate) by arc(predicate, chrono, predicate) in
the type definition of arc, and replacing the goal Arc = arc(PreviousPred,
CurrentPred) by Arc = arc(PreviousPred, chrono(Event), CurrentPred)
in the definition of filter/4.
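
To make the difference between the two variants concrete, here is a small
Python model of both filters. It is a sketch, not the Mercury monitor: the
event dictionaries, the internal port name disj, the start node ("user", 0)
and the hand-written trace are assumptions chosen for the example.

```python
EXTERNAL_PORTS = {"call", "exit", "fail", "redo"}

def collect(events, initial, filter_):
    """Model of collect: fold filter_ over the event stream."""
    acc = initial
    for event in events:
        acc = filter_(event, acc)
    return acc

def current_pred(event):
    return (event["name"], event["arity"])

def cfg_filter(event, acc):
    """Plain control flow graph: arcs are (origin, destination) pairs."""
    previous, graph = acc
    if event["port"] not in EXTERNAL_PORTS:
        return acc                      # internal events leave the graph as is
    current = current_pred(event)
    return current, graph | {(previous, current)}

def chrono_cfg_filter(event, acc):
    """Variant: arcs also carry the chronological counter of the event."""
    previous, graph = acc
    if event["port"] not in EXTERNAL_PORTS:
        return acc
    current = current_pred(event)
    return current, graph | {(previous, event["chrono"], current)}

# Hand-written trace: (port, predicate name, arity).
RAW = [("call", "main", 2), ("disj", "main", 2), ("call", "data", 1),
       ("exit", "data", 1), ("call", "queen", 2), ("exit", "queen", 2),
       ("redo", "queen", 2), ("fail", "queen", 2), ("fail", "main", 2)]
EVENTS = [{"chrono": i, "port": p, "name": n, "arity": a}
          for i, (p, n, a) in enumerate(RAW, start=1)]

START = (("user", 0), frozenset())
_, plain_graph = collect(EVENTS, START, cfg_filter)
_, labelled_graph = collect(EVENTS, START, chrono_cfg_filter)
print(len(plain_graph), len(labelled_graph))
```

In the plain graph the three successive queen/2 to queen/2 transitions
collapse into a single arc, whereas the labelled variant keeps one arc per
external event.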



3.2 Dynamic call graphs 

We define the dynamic call graph of a logic program execution as the
sub-graph of the (static) call graph composed of the nodes that have been exercised
during the execution. In other words, it is an execution slice of the program
call graph. For example, we can see in Figure 7 that predicate main/2 called
predicates data/1, queen/2 and print_list/2. An implementation of a monitor
that computes this graph is given in Figure 8.

Here is a line by line description of the code of Figure 8. (1) Library
modules set and stack are imported. (2) In order to define this monitor,
we use the same data structures as for the previous monitor, except that
we replace the last exercised predicate by the whole call stack in the collected
variable type. This call stack is computed on the fly. (3) The stack and the set
of arcs are initialized to the empty stack and to the empty set respectively. (4)
At call events, we insert an arc from the previous predicate to the current one
and we push (predicate stack__push/3) the current predicate on the stack.
At redo events, we only update the call stack by pushing the current
predicate on it. At fail and exit events, we remove the top element of the stack
(predicate stack__pop_det/3). The post-processed result of the execution
of this monitor on the n queens program is given in Figure 7.

Figure 7: The dynamic call graph of the n queens program

In the current implementation of collect, the call stack is not passed as an 
event attribute. The reason for that is that the call stack can be very large, 
which would slow down the collect monitors. Another reason is that, as 
demonstrated in this example, it is very easy to reconstruct this information. 
It is also interesting to note that the impact on the performance of handling 
the stack on the fly as we do here is hardly noticeable. 
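
The on-the-fly reconstruction of the call stack can be followed step by
step in this Python model of the monitor. It is a sketch under assumptions:
events are dictionaries, the stack is a tuple whose last element is the top,
the bottom element ("user", 0) mirrors the initial value used in Figure 8,
and the trace is written by hand.

```python
def collect(events, initial, filter_):
    """Model of collect: fold filter_ over the event stream."""
    acc = initial
    for event in events:
        acc = filter_(event, acc)
    return acc

def call_graph_filter(event, acc):
    stack, graph = acc
    port = event["port"]
    current = (event["name"], event["arity"])
    if port == "call":
        # The caller is the predicate on top of the stack before the push.
        graph = graph | {(stack[-1], current)}
        stack = stack + (current,)
    elif port == "redo":
        stack = stack + (current,)      # re-entering a goal: push only
    elif port in ("exit", "fail"):
        stack = stack[:-1]              # leaving a goal: pop
    # internal ports change neither the stack nor the graph
    return stack, graph

RAW = [("call", "main", 2), ("call", "data", 1), ("exit", "data", 1),
       ("call", "queen", 2), ("call", "qperm", 2), ("exit", "qperm", 2),
       ("call", "safe", 1), ("fail", "safe", 1), ("redo", "qperm", 2),
       ("exit", "qperm", 2), ("call", "safe", 1), ("exit", "safe", 1),
       ("exit", "queen", 2), ("exit", "main", 2)]
EVENTS = [{"port": p, "name": n, "arity": a} for p, n, a in RAW]

START = ((("user", 0),), frozenset())
final_stack, call_graph = collect(EVENTS, START, call_graph_filter)
print(sorted(call_graph))
```

At the end of the trace the stack is back to its initial content, and the
second call to safe/1 adds no new arc because the graph is a set.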

3.3 Proof Trees 

Another program abstract view widely used in the logic programming
community is the proof tree. A proof tree of a program execution is the dynamic
call graph where all the fail nodes are omitted. Thus, for example, a failing
request produces an empty proof tree. We do not give the code of the
monitor that implements the proof tree, but rather briefly explain how it can
be done with collect. The idea is to maintain a table of proof trees and a
table of goal immediate successors, both indexed by goal numbers (each goal
is uniquely identified by its goal number). When predicates successfully exit,
we construct the proof tree of the current goal. We can do that because at
that port, we have the whole list of the immediate successors of the current
goal and we know the proof trees of each of these successors. At redo ports, those
proof trees are removed from the table of proof trees. In order to calculate
the list of immediate successors, we also need to maintain a stack of goal
numbers, in exactly the same manner as in the two previous monitors.

:- import_module set, stack.

:- type predicate ---> p(proc_name, arity).
:- type arc ---> arc(predicate, predicate).
:- type graph == set(arc).
:- type collected_type ---> ct(stack(predicate), graph).

initialize(ct(Stack, set__init)) :-
    stack__push(stack__init, p("user", 0), Stack).

filter(Event, ct(Stack0, Graph0), Acc, continue) :-
    Port = port(Event),
    CurrentPred = p(proc_name(Event), proc_arity(Event)),
    update_call_stack(Port, CurrentPred, Stack0, Stack),
    ( ( Port = call ) ->
        PreviousPred = stack__top_det(Stack0),
        set__insert(Graph0, arc(PreviousPred, CurrentPred), Graph),
        Acc = ct(Stack, Graph)
    ;
        Acc = ct(Stack, Graph0)
    ).

:- pred update_call_stack(trace_port_type::in, predicate::in,
    stack(predicate)::in, stack(predicate)::out) is det.
update_call_stack(Port, CurrentPred, Stack0, Stack) :-
    ( ( Port = call ; Port = redo ) ->
        stack__push(Stack0, CurrentPred, Stack)
    ; ( Port = fail ; Port = exit ) ->
        stack__pop_det(Stack0, _, Stack)
    ;
        % internal ports
        Stack = Stack0
    ).

Figure 8: Monitor that calculates dynamic call graphs

The post-processed result of the execution of this monitor on the n queens
program is given in Figure 9. It is also possible to construct SLD-trees with
the same kinds of monitors, but we lack the necessary space to describe it
here⁴. We have not included the predicate arguments in the graph nodes, but
that could have easily been done. We could actually add all the event
attributes to the graph nodes. But then we would really need a visualization tool
back-end that, for example, would display all that information on request
when nodes are clicked.
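
Since the code of the proof tree monitor is not given, the construction
outlined in this section can be approximated by the following Python model.
It is only one possible reading of the description, not the monitor
distributed with Morphine. It assumes that each event carries a goal number,
that backtracking over a goal that had succeeded always shows up as a redo
event for that goal, and that a virtual goal number 0 stands for the
top-level request; all names are hypothetical.

```python
def collect(events, initial, filter_):
    """Model of collect: fold filter_ over the event stream."""
    acc = initial
    for event in events:
        acc = filter_(event, acc)
    return acc

def initialize():
    # trees: goal number -> proof tree; succ: goal number -> immediate successors
    return {"trees": {}, "succ": {0: []}, "stack": [0]}

def proof_tree_filter(event, acc):
    port, goal, name = event["port"], event["goal"], event["name"]
    trees, succ, stack = acc["trees"], acc["succ"], acc["stack"]
    if port == "call":
        succ[stack[-1]].append(goal)    # goal is a successor of its parent
        succ[goal] = []
        stack.append(goal)
    elif port == "redo":
        trees.pop(goal, None)           # the previous proof is discarded
        stack.append(goal)
    elif port == "exit":
        # Successors without a proof tree (failed goals) are skipped.
        children = [trees[g] for g in succ[goal] if g in trees]
        trees[goal] = (name, children)
        stack.pop()
    elif port == "fail":
        trees.pop(goal, None)
        stack.pop()
    return acc

def top_level_trees(acc):
    return [acc["trees"][g] for g in acc["succ"][0] if g in acc["trees"]]

def events(raw):
    return [{"port": p, "goal": g, "name": n} for p, g, n in raw]

SUCCESS = events([("call", 1, "main"), ("call", 2, "queen"),
                  ("call", 3, "qperm"), ("exit", 3, "qperm"),
                  ("call", 4, "safe"), ("fail", 4, "safe"),
                  ("redo", 3, "qperm"), ("exit", 3, "qperm"),
                  ("call", 5, "safe"), ("exit", 5, "safe"),
                  ("exit", 2, "queen"), ("exit", 1, "main")])
FAILURE = events([("call", 1, "main"), ("fail", 1, "main")])

success_trees = top_level_trees(collect(SUCCESS, initialize(), proof_tree_filter))
failure_trees = top_level_trees(collect(FAILURE, initialize(), proof_tree_filter))
print(success_trees, failure_trees)
```

The failed call to safe (goal 4) does not appear in the resulting tree, and
the failing request yields no tree at all, as stated in the definition above.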




Figure 9: The proof tree of the n queens program



4 The source code of all those monitors, SLD-tree included, is available at the Morphine
web site http://www.irisa.fr/lande/jahier/download.html

4 Merging monitors



:- type collected_type == type_i.

initialize(Ci) :- initialize_i(Ci).

filter(Event, CiIn, CiOut, continue) :- filter_i(Event, CiIn, CiOut).

Figure 10: A set of monitors indexed by i in {1, ..., n}



A nice property of collect is that it implements monitors that can easily
be merged. All the monitors can therefore collect their data in only one
program execution. Indeed, consider the n monitors of Figure 10 where: i is
in {1, ..., n}; type_i is an arbitrary Mercury type; initialize_i is a predicate
that outputs a term of type type_i; and filter_i is a predicate that takes
an event and a term of type type_i, and outputs a term of type type_i. We
suppose that there are no name clashes between Ci, CiIn and CiOut. Then
all those monitors can be merged as shown in Figure 11.



:- type collected_type ---> union(type_1, ..., type_n).

initialize(CollectVar) :-
    CollectVar = union(C1, ..., Cn),
    initialize_1(C1),
    ...
    initialize_n(Cn).

filter(Event, CollectVarIn, CollectVarOut, continue) :-
    CollectVarIn = union(C1In, ..., CnIn),
    filter_1(Event, C1In, C1Out),
    ...
    filter_n(Event, CnIn, CnOut),
    CollectVarOut = union(C1Out, ..., CnOut).

Figure 11: The monitor obtained by merging the n monitors of Figure 10



The collecting variable type of the resulting monitor is a functor whose
arguments are the collecting variable types of the n monitors. The initialization
(resp. filtering) predicate successively initializes (resp. updates) each collecting
variable using the initialization (resp. filtering) predicate of the sub-monitors.
This can easily be done automatically.
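
The automatic merge can be sketched generically in Python, where a tuple
plays the role of the union functor. This is an illustration under
assumptions, not part of Morphine: monitors are modelled as (initialize,
filter) pairs of Python functions, and the two sub-monitors and the trace
are hypothetical examples.

```python
def collect(events, initialize, filter_):
    """Model of collect: fold filter_ over the event stream."""
    acc = initialize()
    for event in events:
        acc = filter_(event, acc)
    return acc

def merge(*monitors):
    """Build one monitor from n (initialize_i, filter_i) pairs."""
    def initialize():
        return tuple(init() for init, _ in monitors)

    def filter_(event, accs):
        # Each sub-monitor updates its own component of the merged accumulator.
        return tuple(f(event, acc) for (_, f), acc in zip(monitors, accs))

    return initialize, filter_

# Sub-monitor 1: number of call events.
def count_call(event, count):
    return count + 1 if event["port"] == "call" else count

# Sub-monitor 2: maximum depth reached.
def max_depth(event, deepest):
    return max(deepest, event["depth"])

EVENTS = [{"port": p, "depth": d}
          for p, d in [("call", 1), ("call", 2), ("exit", 2),
                       ("call", 2), ("exit", 2), ("exit", 1)]]

merged_init, merged_filter = merge((lambda: 0, count_call),
                                   (lambda: 0, max_depth))
merged_result = collect(EVENTS, merged_init, merged_filter)
print(merged_result)
```

A single pass over the events produces both results, which is the property
the merge is meant to provide.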

5 Performance issues 

We have seen in the previous sections how it is possible for users, without any
knowledge about the Mercury trace system, to implement their own monitors.
This genericity has a price: efficiency. In this section, we try to exhaustively
examine the sources of performance overhead of our approach compared with
hand-crafted ad hoc monitors implemented inside the compiler.

Granularity of the instrumentation. The principal source of overhead
is that not all the monitors need an instrumentation as fine grained
as the one the trace system provides. The only control we have over the
granularity of the instrumentation is the possibility to generate only
external events when compiling programs. In order to assess this overhead, we
have compared the execution times of Mercury programs executed normally
with programs executed under the control of the trace system. By
executed under the control of the trace system, we mean that, at each event, the trace
system is called and then simply returns. We have measured a slowdown of
around a factor of two if internal events are not generated, and a factor of
four if they are generated.

Unused event attributes. A second source of overhead is that
we systematically pass all event attributes to filter, even if they are not
used. Actually, this is not really a problem since it is possible to dynamically
choose the event attributes that are available in the event structure. In the
current implementation of collect, it is already the case for the list of live
arguments. Indeed, since this attribute can be very large, we want to avoid
the cost of handling it when it is not necessary. To assess this source of
overhead for the other attributes, we have measured an implementation of
collect that handles all the event attributes versus a version that handles none
of them (leading to monitors that cannot do anything useful except count
the number of events). The slowdown was smaller than 10%.

Scaling up the approach. Performance problems might occur when the
monitoring data become very large. A possible solution is to buffer the
data collection by sending intermediate data to the Morphine process every
N events. This is possible thanks to the fourth argument of filter/4, the flag
which makes it possible to stop the monitoring process at any event. Then
Morphine, which is based on a full existing Prolog programming environment,
can manage the analysis of the collected data and then start a new collect call
to finish the execution monitoring. Moreover, on a two-processor machine, we
can even process those partially collected data asynchronously and thus perform
the collecting and the analysis steps in parallel.
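
The buffering scheme can be modelled in Python as a monitor that raises the
stop flag every N events and a driver that restarts collect on the same
event stream. This is a sequential sketch under assumptions: the driver
function monitor_in_batches, the batch monitor and the per-port statistics
are hypothetical, and the parallel analysis mentioned above is not modelled.

```python
from collections import Counter

CONTINUE, STOP = "continue", "stop"

def collect(events, initialize, filter_):
    """Model of collect: consume events until the end or until the flag is stop."""
    acc = initialize()
    for event in events:
        acc, flag = filter_(event, acc)
        if flag == STOP:
            break
    return acc

def make_batch_monitor(n):
    """Monitor that gathers the ports of at most n events, then stops."""
    def initialize():
        return []

    def filter_(event, batch):
        batch = batch + [event["port"]]
        return batch, (STOP if len(batch) >= n else CONTINUE)

    return initialize, filter_

def monitor_in_batches(events, n, analyse):
    """Driver: restart collect on the same event stream until it is exhausted."""
    stream = iter(events)               # shared iterator, so each call resumes
    initialize, filter_ = make_batch_monitor(n)
    while True:
        batch = collect(stream, initialize, filter_)
        if not batch:
            break
        analyse(batch)                  # the analysis runs outside the monitor

EVENTS = [{"port": p}
          for p in ["call", "call", "exit", "call", "fail", "redo", "exit"]]

totals = Counter()
batch_sizes = []

def analyse(batch):
    batch_sizes.append(len(batch))
    totals.update(batch)

monitor_in_batches(EVENTS, 3, analyse)
print(batch_sizes, dict(totals))
```

The accumulator never holds more than N items, while the totals kept by
the analysis side cover the whole execution.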



6 Related Work 

Programmable debuggers. Ducasse designed Coca [?] and Opium [?],
programmable debuggers for C and Prolog respectively. Coca and Opium
are based on a Prolog interpreter plus a handful of coroutining primitives
connected to the trace system. Those primitives allow the Prolog interpreter
to communicate with the debugged program. Coca and Opium are full
debugging programming languages in which all classical debugger commands
can be implemented straightforwardly. However, it appears that the set of
coroutining primitives of Coca and Opium is not well suited for monitoring.
All the monitors implemented with collect can easily be implemented with
this set of primitives, but the resulting monitors require too many Operating
System level context switches and too much socket traffic between the
program and the monitor. With programs of several million execution events,
such monitors are several orders of magnitude slower than their counterparts
that use collect [13].



The collect primitive does not only extend the Coca/Opium primitives
in terms of efficiency, but also in terms of expressiveness. Using the existing
primitives to implement monitors, one typically duplicates the code that (1)
makes the execution move one event forward, (2) checks if the end of the
execution is reached. Those two steps are automatically done when using
collect. In other words, the expressiveness improvement between collect and
the existing primitives of Coca/Opium is the same as the improvement we
have between using foldl and processing lists manually. As a matter of fact,
collect can be seen as a generalization of Opium/Coca coroutining primitives
since they can all be implemented with it.

Automated development of monitors. Jeffery et al. designed Alamo [14],
an architecture that aims at easing the development of monitors for C
programs. As in our approach, their monitoring architecture is based on event
filtering and monitors can be programmed. Their system deals with trace
extraction whereas we rely on an already available tracer; this saves us from
a low level and tedious task which has already been done and optimized. On
the other hand, we do not have control over the information available in
the trace. Note however that lacking information can sometimes be
recovered, as we did for example to reconstruct the call stack. Moreover, to avoid
code explosion, they perform part of the event filtering at compilation time.
Hence they need to recompile the program each time they want to execute
another monitor whereas we only need to dynamically link the monitor to
the monitored program. Alamo and the monitored program are running in
coroutining, but within the same address space. Alamo has therefore fewer
efficiency problems than Coca and Opium for monitoring. Eustace and
Srivastava developed Atom [?], a system that also aims at easing the building
of monitors. The difference with Alamo is that monitors are implemented
with procedure calls and global variables, which is much more efficient than
coroutining. However, the language Atom provides is far less expressive than
the one of Alamo. Alamo and Atom have influenced the design of collect
and we tried to take the best of both: a full and high level programming
language implemented by procedure calls.

Kishon et al. [?] use a denotational and operational continuation
semantics to formally define monitors for a simple functional programming
language. The kinds of monitors they define are profilers, debuggers, and
statistic collectors. From the operational semantics, a formal description of
the monitor, and a program, they derive an instrumented executable file that
performs the specified monitoring activity. The semantics of the original
programs is preserved. Then, they use partial evaluation to make their monitors
reasonably efficient. The main disadvantage of this approach is that they
are rebuilding a whole execution system from scratch, without taking
advantage of the existing compiler. We strongly believe that it is important to
have the same execution system for debugging and for producing the final
executable program. As noted by Brooks et al. [?], some errors only occur
in the presence of optimisations, and vice versa; some programs can only be
executed in their optimized form because of time and memory constraints; when
searching for "hot spots", it is better to do it with the optimized program as
lots of things can be optimized away; and finally, sometimes, the error comes
from the optimisations themselves.



Efficient monitoring Patil and Fisher [17] tackle the problem of
performance monitoring by delegating the monitoring activities to a second
processor that they call a shadow processor. Their approach is very
efficient; the monitored program is hardly slowed down. However, the set of
monitoring commands they propose cannot be extended.



Invariant Detection Explicitly stated program invariants can help users
to identify program properties which must be preserved when modifying code.
Invariant discovery is generally done statically ||. However, static
analyses miss true but uncomputable properties and properties that depend
on the program inputs. Ernst et al. investigate a complementary approach
that consists of dynamically detecting program invariants. The idea is to
run instrumented versions of programs on a sufficiently large set of test
cases, and then to examine the values they compute, looking for patterns
and relationships among those values. Useless invariants are filtered out.
A prototype implementation, Daikon, demonstrates the feasibility of this
approach. Despite its intrinsic unsoundness, Ernst et al. report that
dynamic invariant discovery can be very useful in practice. We believe that
it could be an interesting application of collect.
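
To make the idea concrete, here is a minimal Python sketch of how a
fold-style monitor could gather candidate invariants. It is an illustration
under stated assumptions, not Daikon's algorithm and not the collect API:
the event representation (a dictionary with an "args" field) and the names
initialize, filter_event and candidate_invariants are hypothetical, and only
value ranges are tracked.

```python
from functools import reduce

def initialize():
    # Maps a variable name to the (min, max) of the values observed so far.
    return {}

def filter_event(ranges, event):
    """Fold step: widen the observed range of each argument of the event."""
    updated = dict(ranges)
    for name, value in event["args"].items():
        low, high = updated.get(name, (value, value))
        updated[name] = (min(low, value), max(high, value))
    return updated

def candidate_invariants(ranges):
    """Turn observed ranges into candidate invariants (unsound by nature)."""
    result = []
    for name, (low, high) in sorted(ranges.items()):
        if low == high:
            result.append(f"{name} == {low}")
        else:
            result.append(f"{low} <= {name} <= {high}")
    return result

# Hypothetical events, written by hand for illustration.
events = [
    {"port": "call", "args": {"D": 1, "B": 3}},
    {"port": "call", "args": {"D": 2, "B": 3}},
    {"port": "call", "args": {"D": 3, "B": 3}},
]
ranges = reduce(filter_event, events, initialize())
invariants = candidate_invariants(ranges)
```

The reported properties hold only for the executions observed, which is the
unsoundness mentioned above.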



7 Conclusion 

For a given monitor, provided that all the necessary information is in the
trace, (1) is it always possible to implement it? (2) Is it always easy to
implement it? (3) Is it always possible to implement it efficiently?
Since it is possible to collect the whole execution trace, the answer to
the first question is yes. This would be the most inefficient way of
implementing monitors, though, as the trace can be huge. The second
question is more difficult to assess. However, we believe that collect has
the right level of abstraction for this. Processing a trace with collect is
the same as processing a list with a foldl operator, and the foldl operator
is very expressive, as argued by Hutton [[UJ. Indeed, processing lists
using a foldl operator has significant advantages over processing lists
manually. This article contributes an experimental assessment of the second
question by demonstrating how easily a wide range of monitors can be
implemented with collect. With regard to the third question, the
measurements we made lead us to believe that the cost of monitors
implemented with collect is acceptable. We believe that all the monitors
implemented with collect execute within the same order of magnitude of time
as their hand-crafted counterparts.
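
The analogy between collect and foldl can be sketched in a few lines of
Python. This is an illustration only: the Event record, the hand-written
trace and the Python functions initialize, filter_event and collect are
assumptions made for the example (the actual initialize and filter are
written in Mercury), and functools.reduce plays the role of foldl.

```python
from functools import reduce
from typing import NamedTuple

class Event(NamedTuple):
    chrono: int      # event number
    port: str        # "call", "exit", "fail", "redo", ...
    proc_name: str   # name of the procedure

def initialize():
    """Initial value of the accumulator: no call counted yet."""
    return {}

def filter_event(acc, event):
    """Fold step: update the accumulator with one event."""
    if event.port == "call":
        acc = {**acc, event.proc_name: acc.get(event.proc_name, 0) + 1}
    return acc

def collect(trace):
    """Fold filter_event over the trace, starting from initialize()."""
    return reduce(filter_event, trace, initialize())

# Hypothetical trace fragment, written by hand for illustration.
trace = [
    Event(1, "call", "main"),
    Event(2, "call", "queen"),
    Event(3, "call", "qperm"),
    Event(4, "call", "qperm"),
    Event(5, "exit", "qperm"),
]
call_counts = collect(trace)
```

The monitor never stores the trace itself; it only keeps the accumulator,
which is why the whole trace does not need to be collected first.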

The choice of Mercury to implement initialize and filter is arbitrary.
The reason for using Mercury in this context is that people who want to
monitor Mercury programs are very likely to be Mercury users. Moreover,
since filter may be called millions of times, it makes sense to use a
highly optimizing compiler such as the Mercury compiler.

Appendix 1 - The n-queens program in Mercury



    :- module queens.

    :- interface.

    :- import_module io.

    :- pred main(io__state, io__state).
    :- mode main(di, uo) is cc_multi.

    :- implementation.

    :- import_module list, int.

    main -->
        ( { data(Data), queen(Data, Out) } ->
            print_list(Out)
        ;
            io__write_string("No solution\n")
        ).

    :- pred data(list(int)).
    :- mode data(out) is det.

    :- pred queen(list(int), list(int)).
    :- mode queen(in, out) is nondet.

    :- pred qperm(list(T), list(T)).
    :- mode qperm(in, out) is nondet.

    :- pred qdelete(T, list(T), list(T)).
    :- mode qdelete(out, in, out) is nondet.

    :- pred safe(list(int)).
    :- mode safe(in) is semidet.

    :- pred nodiag(int, int, list(int)).
    :- mode nodiag(in, in, in) is semidet.

    data([1,2,3,4,5]).

    queen(Data, Out) :-
        qperm(Data, Out),
        safe(Out).

    qperm([], []).
    qperm([X|Y], K) :-
        qdelete(U, [X|Y], Z),
        K = [U|V],
        qperm(Z, V).

    qdelete(A, [A|L], L).
    qdelete(X, [A|Z], [A|R]) :-
        qdelete(X, Z, R).

    safe([]).
    safe([N|L]) :-
        nodiag(N, 1, L),
        safe(L).

    nodiag(_, _, []).
    nodiag(B, D, [N|L]) :-
        NmB is N - B,
        BmN is B - N,
        ( D = NmB ->
            fail
        ; D = BmN ->
            fail
        ;
            true
        ),
        D1 is D + 1,
        nodiag(B, D1, L).

    :- pred print_list(list(int)::in,
        io__state::di,
        io__state::uo) is det.

    print_list(Xs) -->
        ( { Xs = [] } ->
            io__write_string("[]\n")
        ;
            io__write_string("["),
            print_list_2(Xs),
            io__write_string("]\n")
        ).

    :- pred print_list_2(list(int)::in,
        io__state::di,
        io__state::uo) is det.

    print_list_2([]) --> [].
    print_list_2([X|Xs]) -->
        io__write_int(X),
        ( { Xs = [] } ->
            []
        ;
            io__write_string(", "),
            print_list_2(Xs)
        ).

Appendix 2 - The Mercury trace

There are three kinds of attributes: attributes containing information relative 
to the control-flow (numbered from 1 to 6 in the following), to the data-flow 
(7 and 8) as well as information relative to the source code (9 and 10). The 
different attributes provided by the Mercury tracer are listed below. 

1. Event number (chrono). It is the rank of the event in the trace. 

2. Goal invocation number (call). 

3. Execution depth (depth) . 

4. Event type or port (port). There are the four traditional ports call,
exit, fail and redo introduced by Byrd [|J for Prolog. Mercury also
generates internal events describing what occurs inside a call: an event
of type disj is generated each time the execution enters a branch of
a disjunction, of type switch (see footnote 5) if this disjunction is a
switch, of type then if it is the "then" branch of an if-then-else and of
type else if it is the "else" branch.

5. Determinism (deter). It characterizes the number of potential
solutions for each procedure. The different determinism markers are
described in Section 2.1.

6. Procedure (proc). It is characterized by: a flag indicating if the current
procedure is a function or a predicate (proc_type), a module name
(module), a procedure name (proc_name), an arity (arity) and a
mode number (mode_num; see footnote 6).

7. List of live arguments (arg). A variable is said to be live at a given 
point of the execution if its instantiation is still available. 

8. List of local live variables (local_var). These are the live variables
that are not arguments of the current procedure.

5 A switch is a disjunction in which each branch unifies a ground variable with a different
function symbol. In that case, at most one branch of the disjunction will provide a solution.

6 The mode number encodes the mode of a predicate or a function: when a predicate 
has one mode, this number is 0. If not, this number corresponds to the rank of appearance 
in the code of the mode declaration; 1 for the first, 2 for the second, etc. 

9. Goal Path (goal_path). It is a list indicating in which branch of the
code the current event takes place. The then and else branches of an
if-then-else are represented by t and e respectively; conjunctions,
disjunctions and switches are represented by ci, di and si respectively,
where i is the number of the conjunction, disjunction, or switch. For
example, an event whose path is [c3;e;d1] corresponds to an event which
occurs in the first branch of a disjunction, which is itself part of the
else branch of an if-then-else, which is in the third conjunction of the
current procedure. For efficiency reasons, this attribute is only
available at internal events.

10. Ancestor stack (ancestors). 



A more detailed description of the contents of the Mercury trace is given
in the Mercury language reference manual || and in the user's manual of
Morphine [12].
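
As an illustration of the goal path attribute described in item 9, the
following Python sketch decodes a path written in the textual form used
above, such as [c3;e;d1]. The function describe_goal_path and the wording of
its output are hypothetical; the sketch assumes only the step letters given
in item 9 (t, e, ci, di, si) and is not part of the Mercury tracer.

```python
import re

def describe_goal_path(path):
    """Describe a goal path such as "[c3;e;d1]", outermost step first."""
    names = {
        "c": "conjunction",
        "d": "disjunction branch",
        "s": "switch branch",
    }
    inner = path.strip().strip("[]")
    if not inner:
        return []                      # empty path: no enclosing branch
    steps = []
    for item in inner.split(";"):
        item = item.strip()
        if item == "t":
            steps.append("then branch")
        elif item == "e":
            steps.append("else branch")
        else:
            match = re.fullmatch(r"([cds])(\d+)", item)
            if match is None:
                raise ValueError(f"unrecognized goal path step: {item!r}")
            kind, number = match.groups()
            steps.append(f"{names[kind]} {number}")
    return steps

steps = describe_goal_path("[c3;e;d1]")
```

For the example path of item 9, the result lists the third conjunction,
then the else branch, then the first branch of the disjunction.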



